Automatic Extraction of Top-K Lists from Web
Similar resources
Automatic Extraction of Logical Web Lists
Recently, there has been increased interest in the extraction of structured data from the web (both “Surface” Web and “Hidden” Web). In particular, in this paper we focus on the automatic extraction of Web Lists. Although this task has been studied extensively, existing approaches are based on the assumption that lists are wholly contained in a Web page. They do not consider that many websites sp...
Human-Powered Top-k Lists
We propose an algorithm that obtains the top-k list of items out of a larger itemset, using human workers (e.g., through crowdsourcing) to perform comparisons among items. An example application is finding the best photographs in a large collection by asking humans to evaluate different photos. Our algorithm has to address several challenges: obtaining worker input has high latency; workers may...
Comparing top-k XML lists
Systems that produce ranked lists of results are abundant. For instance, Web search engines return ranked lists of Web pages. There has been work on distance measures for list permutations, such as Kendall tau and Spearman’s Footrule, as well as extensions to handle top-k lists, which are more common in practice. In addition to ranking whole objects (e.g., Web pages), there is an increasing number ...
Efficient Techniques for Crowdsourced Top-k Lists
We focus on the problem of obtaining top-k lists of items from larger itemsets, using human workers to perform comparisons among items. An example application is short-listing a large set of college applications using advanced students as workers. We describe novel efficient techniques and explore their tolerance to adversarial behavior and the tradeoffs among different measures of performance (...
Journal
Journal title: IARJSET
Year: 2017
ISSN: 2393-8021
DOI: 10.17148/iarjset/nciarcse.2017.42